Topic Representation of Researchers' Interests in a Large-Scale Academic Database and Its Application to Author Disambiguation

Authors

  • Marie Katsurai
  • Ikki Ohmukai
  • Hideaki Takeda
Abstract

Promoting interdisciplinary research and recommending collaborators from different research fields through academic database analysis is crucial. This paper addresses the problem of characterizing researchers’ interests with a set of diverse research topics found in a large-scale academic database. Specifically, we first use latent Dirichlet allocation to extract topics as distributions over words from a training dataset. Then, we convert the textual features of a researcher’s publications to topic vectors, and calculate the centroid of these vectors to summarize the researcher’s interest as a single vector. In experiments conducted on CiNii Articles, which is the largest academic database in Japan, we show that the extracted topics reflect the diversity of the research fields in the database. The experimental results also indicate the applicability of the proposed topic representation to the author disambiguation problem.

Key words: researcher analysis, academic database, topic model, author disambiguation
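The centroid step described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the per-publication topic vectors are made-up values standing in for the output of a trained latent Dirichlet allocation model, the helper names `centroid` and `cosine_similarity` are hypothetical, and the use of cosine similarity to compare two interest vectors is an assumption (the abstract does not say which measure is used for disambiguation).

```python
import math


def centroid(topic_vectors):
    """Average equal-length topic vectors component-wise."""
    if not topic_vectors:
        raise ValueError("need at least one topic vector")
    n_topics = len(topic_vectors[0])
    return [
        sum(vec[k] for vec in topic_vectors) / len(topic_vectors)
        for k in range(n_topics)
    ]


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical topic vectors (3 topics), one per publication.
# In practice these would come from LDA inference on each paper's text.
author_a_papers = [[0.8, 0.1, 0.1], [0.6, 0.3, 0.1]]
author_b_papers = [[0.1, 0.1, 0.8], [0.2, 0.1, 0.7]]

# Summarize each researcher's interest as a single vector.
interest_a = centroid(author_a_papers)
interest_b = centroid(author_b_papers)

# Assumed comparison step: a low score suggests two records with the
# same name may belong to different researchers.
similarity = cosine_similarity(interest_a, interest_b)
print(interest_a, interest_b, round(similarity, 3))
```

Under these assumptions, two author records sharing a name but yielding dissimilar interest vectors would be candidates for splitting, while records with similar vectors would be candidates for merging. Any decision threshold would have to be tuned on labeled data.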

Similar resources

The Crisis of Representation in Azadeh Khanoom and Her Author by Reza Baraheni

The crisis of representation is a topic widely discussed in critique and theory of postmodern literature. This refers to the crises of the present era including the crisis of meaning, the perplexity of contemporary humankind amidst a mass of valid and invalid data, alienation, etc. Literature, as the epitome of human life, is a reflection of these crises in the contemporary era. Azadeh Khanoom ...

Local gradient pattern - A novel feature representation for facial expression recognition

Many researchers adopt Local Binary Pattern for pattern analysis. However, the long histogram created by Local Binary Pattern is not suitable for large-scale facial databases. This paper presents a simple facial pattern descriptor for facial expression recognition. The local pattern is computed based on local gradient flow from one side to another side through the center pixel in a 3x3 pixel region...

Improving the Accuracy of Author Name Disambiguation Using Agglomerative Clustering

Today, digital libraries are important academic resources containing millions of citations and essential bibliographic information such as titles, author names and publication venues. From the viewpoint of knowledge accumulation management, the ability to search for desired content quickly and accurately is of great importance. The complexity and similarity in these resources cause many challenges and...

"Seed+Expand": A validated methodology for creating high quality publication oeuvres of individual researchers

The study of science at the individual micro-level frequently requires the disambiguation of author names. The creation of author’s publication oeuvres involves matching the list of unique author names to names used in publication databases. Despite recent progress in the development of unique author identifiers, e.g., ORCID, VIVO, or DAI, author disambiguation remains a key problem when it com...

Scaling production and improving efficiency in DEA: an interactive approach

DEA models help a DMU to detect its (in-)efficiency and to improve activities, if necessary. Efficiency is only one economic aim for a decision-maker; however, up- or downsizing might be a second one. Improving efficiency is the main topic in DEA; the long-term strategy towards the right production size should attract our attention as well. Not always the management of a DMU primarily focuses o...

Journal title:
  • IEICE Transactions

Volume: 99-D

Publication date: 2016